Graph-embedding for speaker recognition
Authors
Abstract
Popular methods for speaker classification perform speaker comparison in a high-dimensional space [1]; however, recent work [2] has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure, in the form of nonlinear manifolds, exists within the high-dimensional space. We use graph embedding [3] as a proxy for the manifold and show how the embedding can be used for data visualization and exploration. ISOMAP [3] is used to explore the existence and dimension of the space. We also examine whether the manifold assumption can help in two classification tasks: data mining and the standard NIST speaker recognition evaluations (SRE) [4]. Our results show that the data lies on a manifold and that exploiting this structure can yield significant improvements on the data-mining task. In preliminary experiments on all trials of the NIST SRE Eval-06 core task, the improvement is smaller but still significant.
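The graph-embedding idea mentioned in the abstract can be illustrated with a minimal ISOMAP sketch. This is not the authors' implementation: the function name `isomap`, its parameters `n_neighbors` and `n_components`, and the use of plain NumPy with a Floyd-Warshall shortest-path step are all assumptions made for illustration. The rows of `X` stand in for high-dimensional speaker vectors.

```python
import numpy as np

def isomap(X, n_neighbors=5, n_components=2):
    """Basic ISOMAP sketch: k-NN graph, graph geodesics, classical MDS.

    Hypothetical helper for illustration only. Returns the embedding and
    the eigenvalue spectrum (descending), which hints at the dimension.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep only edges to each point's k nearest neighbours (symmetrised).
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    graph[rows, cols] = dist[rows, cols]
    graph[cols, rows] = dist[rows, cols]

    # Floyd-Warshall: shortest-path lengths approximate geodesic distances.
    for k in range(n):
        graph = np.minimum(graph, graph[:, k:k + 1] + graph[k:k + 1, :])
    if np.isinf(graph).any():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")

    # Classical MDS on the geodesic distances (double centring + eigh).
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (graph ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    top = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top), eigvals[::-1]


# Toy data: a helix in 3-D, i.e. a 1-D manifold in a higher-dimensional space.
t = np.linspace(0, 3 * np.pi, 60)
X = np.column_stack([np.cos(t), np.sin(t), 0.5 * t])
Y, spectrum = isomap(X, n_neighbors=4, n_components=1)
```

On this toy helix the one-dimensional embedding `Y` should track the curve parameter `t` closely, and a spectrum dominated by a single eigenvalue suggests a one-dimensional manifold. Real speaker data would need a distance suited to the speaker representation and a more scalable shortest-path routine.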
Similar resources
A Novel Classifier Based on Enhanced Lipschitz Embedding for Speech Emotion Recognition
The paper proposes a novel classifier named ELEC (Enhanced Lipschitz Embedding based Classifier) for speech emotion recognition systems. ELEC adopts geodesic distance to preserve the intrinsic geometry of the speech corpus and embeds the high-dimensional feature vector into a lower-dimensional space. Through analyzing the class labels of the neighbor training vectors in the compressed space, ELEC classifies t...
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...